It is a classic result of Shannon that binary digits can be communicated
with arbitrarily small error probability at any rate less than

$$W \log_2\left(1 + \frac{P}{N}\right) \quad \text{(bits/sec)}$$

over a channel with bandwidth W and additive Gaussian noise of average 
power N, using signals of average power at most P. However, in Shannon's 
proof it is assumed that the input to the receiver is the sum of a linear 
combination of the bandlimited functions 

$$\varphi_0(t - k/2W) \triangleq \frac{\sin 2\pi W(t - k/2W)}{2\pi W(t - k/2W)}, \quad -\infty < t < \infty$$

(which are of course of doubly infinite duration) and a sample function
from an exactly bandlimited Gaussian random process. The fact that
$\varphi_0(k/2W) = 0$ for all integers $k \neq 0$ plays a key role in that it implies
the total absence of intersymbol interference.

As a result of these assumptions, there have been some objections to the 
Shannon model in connection with the notion of rate, the fact that the re- 
ceived signals are entire functions (which are predictable for all time from 
a knowledge of their values on any interval of nonzero length) and the fact 
that it is not clear whether the performance of the model is critically dependent
on the assumptions that lead to the absence of intersymbol interference.
Since Shannon's model and his associated ingenious arguments are 
widely known and are of great interest, from the point of view of the system 
theorist, it is important to be able to prove an "insensitivity theorem" to the 
effect that if the model is modified to the extent that: (i) $\varphi_0(t)$ is replaced
by an approximating function $\varphi(t)$ with the property that the signals are of
average power at most $\tilde{P}$ where $\tilde{P}$ is approximately $P$, and $\varphi(t) = 0$ for
$t < t_\varphi$ for some negative number $t_\varphi$, and (ii) the noise is approximately
bandlimited with bandwidth $W$, then, subject to some reasonable qualifications,
it is possible to transmit information, with arbitrarily high reliability,
at any rate less than



W log 2 



(■♦a- 



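We prove such a theorem in this paper. In fact, we show that if the noise
has integrable power spectral density $S(\omega)$ for which

$$0 < \inf_{0 \le \omega < 2\pi W} \sum_{p=-\infty}^{\infty} S(\omega + 4\pi W p)$$

and

$$\hat{N} \triangleq 2W \sup_{0 \le \omega < 2\pi W} \sum_{p=-\infty}^{\infty} S(\omega + 4\pi W p) < \infty$$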

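(these are very weak assumptions), then any rate

$$R < W \log_2\left(1 + \frac{\gamma P}{\hat{N}}\right)$$

is permissible provided that $\gamma \in (0,1)$ is such that [with the understanding that $\varphi(0) = 1$]

$$\sum_{k \neq 0} |\varphi(k/2W)| < (1 - \gamma^{1/2}) \left(\frac{\hat{N}\beta}{2\gamma W P}\right)^{1/2}$$

where $\beta$ is an important positive number that depends on $R$, $(\hat{N}/\gamma)$, $P$,
and $W$.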

Observe that if $S(\omega)$ is the ideal spectral density defined by

$$S(\omega) = \begin{cases} N/2W, & |\omega| \le 2\pi W \\ 0, & |\omega| > 2\pi W \end{cases}$$

then $\hat{N} = N$.

I. INTRODUCTION 

It is a classic result¹ of Shannon that binary digits can be communicated
with arbitrarily small error probability at any rate less than

$$W \log_2\left(1 + \frac{P}{N}\right) \quad \text{(bits/sec)} \tag{1}$$

over a channel with bandwidth $W$ and additive Gaussian noise of
average power $N$, using signals of average power at most $P$. There are,
however, some unrealistic assumptions in Shannon's argument. In
particular, there have been some objections²,³,⁴ to the Shannon model
in connection with, for example, the notion of rate and the fact that the 
received signals are entire functions (which are predictable for all time 
from a knowledge of their values on any interval of nonzero length). 

The purpose of this paper is to focus attention on Shannon's assumptions¹
and show that they can be modified so that the end result is a
quite detailed and informative statement concerned with a much more 
realistic model* of a communication system. 

II. REVIEW OF SHANNON'S ARGUMENT 

2.1 The Capacity of the Time-Discrete Gaussian Channel 

Shannon's result for the bandlimited time-continuous channel follows 
directly from a result concerned with the following type of time-discrete 
channel. 

The channel receives one of $M$ equally likely inputs (i.e., code words)
every $T$ seconds. Each input is a real $n$-vector $X = (x_1, x_2, \ldots, x_n)$
which satisfies

$$|X|^2 \le \rho T$$

where $|X|$ denotes the Euclidean norm of $X$ and $\rho$ is a positive constant
independent of $X$. It is assumed that there exists a positive constant
$\mu$, independent of $T$, such that $n = 2\mu T$ (with the understanding
that we consider only values of $T$ for which $2\mu T$ is an integer).

The channel output (i.e., the receiver input) corresponding to the input
$X$ is the $n$-vector $X + Z$, in which the components of the "noise
vector" $Z$ are independent Gaussian random variables with mean zero
and variance $\eta$. In its attempt to determine which of the $M$ known code
words was transmitted, the receiver may make an error, and we shall
denote by $p_{ei}$ the probability that an error is made given that code word
$i$ is transmitted.

It is assumed that the channel is used to transmit information in the 
following manner. Let a message source produce independent and equally 
likely binary digits at the rate $R$ digits per second. Every $T$ seconds,†
one of $2^{RT}$ possible sequences is produced. We set $M = 2^{RT}$ and we represent
each of the binary sequences by a particular code word.
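
As a small illustration of the block structure just described, the following Python sketch maps one block of $RT$ binary digits to a code word number between 0 and $M - 1$. The function name `sequence_index`, the numerical values of `R` and `T`, and the choice of reading the digits as an unsigned binary integer are illustrative assumptions; the text only requires that each of the $2^{RT}$ sequences be represented by some particular code word.

```python
def sequence_index(bits):
    """Map a block of binary digits to an integer in [0, 2**len(bits))."""
    index = 0
    for b in bits:
        if b not in (0, 1):
            raise ValueError("digits must be 0 or 1")
        index = 2 * index + b  # most significant digit first
    return index


R = 2  # digits per second (illustrative value)
T = 3  # block duration in seconds (illustrative value); R*T must be an integer
M = 2 ** (R * T)  # number of code words needed, M = 2^(RT)

block = [1, 0, 1, 1, 0, 1]  # one block of R*T = 6 digits
code_word_number = sequence_index(block)
```

Any one-to-one assignment of sequences to code words would serve equally well; the binary-integer reading is used here only because it is easy to check by hand.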

* Some different results concerning the significance of the Shannon bound (1) 
are proved in Ref. 4. In particular, there, for certain models, converse proposi- 
tions are established. 

† We consider only values of $T$ for which $RT$ is an integer.

We say that a rate $R$ is permissible if for each $\epsilon > 0$ there exists a $T$
and a corresponding code such that

$$\max_i p_{ei} \le \epsilon.$$

It has been proven that the channel capacity $C$, the least upper bound
of permissible rates, is given by

$$C = \mu \log_2\left(1 + \frac{\rho}{2\mu\eta}\right) \quad \text{(bits/sec)}.$$

It has also been proven that for $R < C$ there exists a positive number
$\beta = \beta(\eta, \rho, \mu, R)$ such that for each $T > 0$ there exists a code with the
property that

$$\max_i p_{ei} = \exp[-\beta T + o(T)].$$



2.2 The Time-Continuous Bandlimited Channel 

In order to use the ideas and results outlined above in his study of the 
time-continuous bandlimited channel, Shannon considers the model 
shown in Fig. 1, with the understanding that $H$ represents an ideal
low-pass filter with cut-off frequency $W$, and $z(\cdot)$ denotes a sample
function of a bandlimited Gaussian random process with mean zero and
power spectral density

$$S(\omega) = \begin{cases} N/2W, & |\omega| \le 2\pi W \\ 0, & |\omega| > 2\pi W, \end{cases}$$

where $N$ is a positive constant. Clearly the average power of $z(\cdot)$ is $N$.

Fig. 1: Model of a Communication System.

As in the time-discrete case, the message source produces $R$ binary
digits per second, so that every $T$ seconds one of $M = 2^{RT}$ possible
sequences is produced. Consider the $i$th such sequence. The coder and
signal generator associates with this sequence a particular $n$-vector
$X = (x_1, x_2, \ldots, x_n)$, where $n = 2WT$, and a corresponding signal

$$u(t) = \sum_{k=1}^{n} x_k \frac{\sin 2\pi W(t - k/2W)}{2\pi W(t - k/2W)}, \quad -\infty < t < \infty,$$

which is transmitted. This process is repeated every $T$ seconds. It is
assumed that

$$|X|^2 \le 2WPT$$

for each code word, so that, for each signal, as can readily be verified,

$$\frac{1}{T} \int_{-\infty}^{\infty} |u(t)|^2 \, dt \le P. \tag{2}$$

Insofar as a physical interpretation of (2) is concerned, the object
on the left is the total energy of $u(\cdot)$ divided by the length of the
interval $[(4W)^{-1}, (4W)^{-1} + T]$ which, considering only the instants
$t = k/2W$, contains all of the samples of $u(\cdot)$ that can be made nonzero.
If (2) holds, then Shannon says that $u(\cdot)$ has average power at most $P$.

The received signal due to the noise and only the $i$th sequence is
$u(\cdot) + z(\cdot)$, since the response of $H$ to $u(\cdot)$ is $u(\cdot)$. The value of this
signal at the instant $t = k/2W$ is

$$x_k + z(k/2W) \quad \text{for } k = 1, 2, \ldots, n$$

in which the $z(k/2W)$ are independent* Gaussian random variables
with mean zero and variance $N$. These sample values are the same as
those that would have been obtained if we had not ignored the effect
at the receiver of transmitted signals due to previous and subsequent
sequences, since the values of such signals at $t = k/2W$ vanish for
$k = 1, 2, \ldots, n$.
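
The absence of intersymbol interference in the ideal model can be checked numerically. The sketch below, in Python, samples the signal $u(t)$ at the instants $t = k/2W$ and compares the samples with the code word components. The helper names `phi0` and `u` and the numerical values of `W` and `code_word` are illustrative assumptions, not part of the original argument.

```python
import math


def phi0(t, W):
    """Ideal pulse sin(2*pi*W*t) / (2*pi*W*t), with phi0(0) = 1."""
    x = 2.0 * math.pi * W * t
    return 1.0 if x == 0.0 else math.sin(x) / x


def u(t, code_word, W):
    """Signal u(t) = sum over k = 1..n of x_k * phi0(t - k/2W)."""
    return sum(x_k * phi0(t - k / (2.0 * W), W)
               for k, x_k in enumerate(code_word, start=1))


W = 4.0                              # bandwidth (illustrative value)
code_word = [1.5, -0.5, 2.0, 0.25]   # x_1, ..., x_n (illustrative values)

# Sample at t = k/2W for k = 1..n; each sample should reproduce x_k,
# because phi0 vanishes at every nonzero multiple of 1/2W.
samples = [u(k / (2.0 * W), code_word, W)
           for k in range(1, len(code_word) + 1)]
```

Up to floating-point rounding, `samples` agrees with `code_word`, which is the property the sampling argument relies on.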

Thus, on the basis of the channel capacity result of the previous section,
we see that our continuous channel can process information, with
arbitrarily high reliability, at any rate less than the capacity of the
time-discrete channel with parameters $\mu = W$, $\rho = 2WP$, and $\eta = N$, that
is, at any rate $R$ less than

$$W \log_2\left(1 + \frac{P}{N}\right).$$
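
The parameter substitution can be verified with a few lines of Python. The sketch below evaluates the time-discrete capacity $\mu \log_2(1 + \rho/2\mu\eta)$ of Section 2.1 with $\mu = W$, $\rho = 2WP$, and $\eta = N$, and compares it with $W \log_2(1 + P/N)$. The function names and the numerical values of `W`, `P`, and `N` are illustrative assumptions.

```python
import math


def discrete_capacity(mu, rho, eta):
    """Capacity mu * log2(1 + rho / (2 * mu * eta)) in bits/sec."""
    return mu * math.log2(1.0 + rho / (2.0 * mu * eta))


def shannon_bound(W, P, N):
    """The bound W * log2(1 + P / N) in bits/sec."""
    return W * math.log2(1.0 + P / N)


W, P, N = 3000.0, 1.0e-3, 1.0e-5   # illustrative values only

c_discrete = discrete_capacity(mu=W, rho=2.0 * W * P, eta=N)
c_continuous = shannon_bound(W, P, N)
```

The two values coincide because $\rho/2\mu\eta = 2WP/2WN = P/N$.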



2.3 Discussion 



* The autocorrelation function of the noise vanishes for $t = k/2W$, $k \neq 0$.

The argument of the last section is based on the assumptions that the
input to the receiver is the sum of a linear combination of the bandlimited
functions

$$\varphi_0(t - k/2W) \triangleq \frac{\sin 2\pi W(t - k/2W)}{2\pi W(t - k/2W)}, \quad k = 1, 2, \ldots$$

(which are of course of doubly infinite duration) and a sample function
from an exactly bandlimited Gaussian random process. The fact that
$\varphi_0(k/2W) = 0$ for all integers $k \neq 0$ plays a key role in that it implies
the total absence of intersymbol interference.

As a result of these assumptions, there have been some objections 
to the Shannon model in connection with the notion of rate, the fact 
that the received signals are entire functions (which are predictable for 
all time from a knowledge of their values on any interval of nonzero
length), and the fact that it is not clear whether or not the performance
of the model is critically dependent on the assumptions that lead to the 
absence of intersymbol interference. 

Since Shannon's model and his associated ingenious arguments are 
widely known and are of great interest, from the point of view of the 
system theorist, it is important to be able to prove an "insensitivity 
theorem" to the effect that if the model is modified to the extent that: 
(i) $\varphi_0(t)$ is replaced by an approximating function $\varphi(t)$ with the property
that the signals are of average power at most $\tilde{P}$ where $\tilde{P}$ is approximately
$P$, and $\varphi(t) = 0$ for $t < t_\varphi$ for some negative number $t_\varphi$, and (ii) the
noise is approximately bandlimited with bandwidth $W$, then, subject
to some reasonable qualifications, it is possible to transmit information, 
with arbitrarily high reliability, at any rate less than 

TT log, (l + 

A quite explicit theorem of this type is stated in the next section. 

III. THE MORE REALISTIC MODEL 

We now consider the system of Fig. 1 to be an approximation to the 
Shannon model described in Section 2.2. 

Here we assume that $z(\cdot)$ is a sample function from a Gaussian random
process with zero mean and integrable power spectral density
$S(\omega)$ with the property that

$$\sup_{0 \le \omega < 2\pi W} \sum_{p=-\infty}^{\infty} S(\omega + 4\pi W p)$$

is finite. From the engineering viewpoint, this finiteness condition is a
very weak assumption; it is certainly satisfied if there exists a constant
$K > 0$ such that $S(\omega) \le K(1 + \omega^2)^{-1}$ for all real $\omega$.

* Shannon himself has indicated⁶ that care must be taken in the physical
interpretation of the result of Section 2.2. However, he does not discuss the effect of
intersymbol interference or the effect of the departure of the noise spectrum
from the ideal spectrum.

We again suppose that the message source produces one of $M = 2^{RT}$
equally likely binary sequences every $T$ seconds. We assume that there
is a first such sequence and that the coder assigns the code word
$(x_1, x_2, \ldots, x_n)$ to it. After $T$ seconds, the second sequence is assigned
the code word $(x_{n+1}, x_{n+2}, \ldots, x_{2n})$, and so on. The integer $n$ is equal
to $2WT$.

The transmitted signal (i.e., the input to the channel) is assumed to 
be given by 

$$u(t) = \sum_{k=1}^{n} x_k \psi(t - k/2W) + \sum_{k=n+1}^{2n} x_k \psi(t - k/2W) + \cdots$$

in which $\psi(\cdot)$ is a real-valued function of $t$ defined on $(-\infty, \infty)$ such
that there exists a negative constant $t_\psi$ with the property that $\psi(t) = 0$
for $t < t_\psi$. It is evident that each of the signal components (i.e., each
sum) is associated with a particular code word, that is, with a particular
input sequence to the coder. We note that the first signal component
"begins" at $t = t_\psi + (2W)^{-1}$, the second at $t_\psi + (2W)^{-1} + T$, and so on.
The operator H in Fig. 1 is assumed here to be causal, linear, and 
time-invariant. Thus, the output of H is 

$$v(t) = \sum_{k=1}^{n} x_k \varphi(t - k/2W) + \sum_{k=n+1}^{2n} x_k \varphi(t - k/2W) + \cdots$$

in which $\varphi(\cdot)$ is the response of $H$ to $\psi(\cdot)$. Since $H$ is causal, there
exists a negative constant $t_\varphi$ such that $\varphi(t) = 0$ for $t < t_\varphi$.

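We assume that $\varphi(0) = 1$ and that $\varphi(\cdot)$ belongs to $L_2$ (i.e., is square
integrable). We think of $\varphi(t)$ as being close to

$$\varphi_0(t) \triangleq \frac{\sin 2\pi W t}{2\pi W t}$$

in the sense that both $\|\varphi - \varphi_0\|$ ($\|\cdot\|$ denotes the $L_2$ norm) and

$$\sum_{\substack{k=-\infty \\ k \neq 0}}^{\infty} |\varphi(k/2W) - \varphi_0(k/2W)| = \sum_{\substack{k=-\infty \\ k \neq 0}}^{\infty} |\varphi(k/2W)|$$

are small. Of course this requires that $-t_\varphi$ be sufficiently large.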

* We may certainly take the view that $\psi(\cdot)$ and $H$ are approximations to the
ideal signal $\varphi_0$ and the ideal bandlimiting filter, respectively. However, the
specific nature of these approximations is not pertinent to our development. Observe,
in fact, that it makes sense for us to assume here that $H$ is an approximation to
the ideal bandlimiting filter, but that $\psi(\cdot)$ is an impulse-like function. The
response $\varphi(\cdot)$ of $H$ to $\psi(\cdot)$ is what we wish to focus attention on.
It is assumed also that

$$\sum_{k=1}^{n} (x_{k+jn})^2 \le 2WPT$$

for $j = 0, 1, 2, \ldots$, so that the "average power"

$$\frac{1}{T} \int_{-\infty}^{\infty} \left| \sum_{k=1}^{n} x_{k+jn} \, \varphi[t - (k + jn)/2W] \right|^2 dt$$

of the $j$th component of $v(\cdot)$ is bounded from above by $P + \epsilon_\varphi$, in
which $\epsilon_\varphi \to 0$ as $\|\varphi - \varphi_0\| \to 0$.

The receiver, which is assumed to be in possession of the code, samples
the signal $v(\cdot) + z(\cdot)$ at the instants $t = k/2W$, $k = 1, 2, \ldots$, to
obtain in succession the "received $n$-vectors"

$$Y_1 = (v_1, v_2, \ldots, v_n) + (z_1, z_2, \ldots, z_n)$$

$$Y_2 = (v_{n+1}, v_{n+2}, \ldots, v_{2n}) + (z_{n+1}, z_{n+2}, \ldots, z_{2n})$$

in which $v_k = v(k/2W)$ and $z_k = z(k/2W)$. These vectors are used as
inputs to a minimum distance decoder. Thus, for example, if

$$|Y_1 - X_i| < \min_{j \neq i} |Y_1 - X_j|,$$

in which $\{X_i\}$ denotes the set of code words, then $Y_1$ is decoded as
$X_i$. We denote by $p_{eij}$ the maximum probability, over all possible
sequences of input code words with the $j$th code word $X_i$, that $Y_j$ is
not decoded as $X_i$. We let

$$p_{ei} \triangleq \sup_j p_{eij}.$$
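
The minimum distance decoding rule can be sketched in a few lines of Python. The function name `min_distance_decode`, the example code, and the tie-breaking rule (lowest index wins) are illustrative assumptions; the text specifies the decision only when the minimum is strict.

```python
import math


def min_distance_decode(y, code_words):
    """Return the index i minimizing |y - X_i|; ties go to the lowest index."""
    best_index, best_dist = None, math.inf
    for i, x in enumerate(code_words):
        dist = math.dist(y, x)  # Euclidean distance between n-vectors
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


# A hypothetical code with M = 4 code words of length n = 2.
code_words = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]

# Received vector: code word 1 plus a small disturbance.
received = (0.8, -1.3)
decoded = min_distance_decode(received, code_words)
```

In the model of this section the disturbance in `received` would consist of both the noise samples and the intersymbol interference from neighboring code words.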

Our result (which is proved in the next section) is

Theorem: Concerning the system described above, let

$$0 < \inf_{0 \le \omega < 2\pi W} \sum_{p=-\infty}^{\infty} S(\omega + 4\pi W p)$$

and

$$\hat{N} \triangleq 2W \sup_{0 \le \omega < 2\pi W} \sum_{p=-\infty}^{\infty} S(\omega + 4\pi W p).$$

Then any rate

$$R < W \log_2\left(1 + \frac{\gamma P}{\hat{N}}\right) \quad \text{(bits/sec)}$$

is permissible (in the sense of Section 2.1 with $p_{ei}$ as defined above) provided
that $\gamma \in (0,1)$ is such that

$$\sum_{k \neq 0} |\varphi(k/2W)| < (1 - \gamma^{1/2}) \left(\frac{\hat{N}\beta}{2\gamma W P}\right)^{1/2}$$

where $\beta = \beta[(\hat{N}/\gamma), 2WP, W, R]$ is the number introduced in Section 2.1.

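Remarks: Observe that if $S(\omega)$ is the ideal power spectral density defined by

$$S(\omega) = \begin{cases} N/2W, & |\omega| \le 2\pi W \\ 0, & |\omega| > 2\pi W \end{cases}$$

then $\hat{N} = N$. The condition that

$$0 < \inf_{0 \le \omega < 2\pi W} \sum_{p=-\infty}^{\infty} S(\omega + 4\pi W p)$$

is certainly satisfied if $S(\omega)$ is a reasonable approximation to the ideal
spectrum.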
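
The quantity $\hat{N}$ can be approximated numerically for a given spectral density. The Python sketch below truncates the sum over $p$ and evaluates it on a grid of frequencies in $0 \le \omega < 2\pi W$; the function names, the truncation limit `p_max`, the grid size, and the numerical values of `W` and `N` are illustrative assumptions. For the ideal spectrum the computed value should equal $N$.

```python
import math


def aliased_sum(S, omega, W, p_max):
    """Truncated sum over p = -p_max..p_max of S(omega + 4*pi*W*p)."""
    return sum(S(omega + 4.0 * math.pi * W * p)
               for p in range(-p_max, p_max + 1))


def n_hat_bounds(S, W, p_max=50, grid=2000):
    """Approximate (2W * inf, 2W * sup) of the sum over 0 <= omega < 2*pi*W."""
    values = [aliased_sum(S, 2.0 * math.pi * W * i / grid, W, p_max)
              for i in range(grid)]
    return 2.0 * W * min(values), 2.0 * W * max(values)


W, N = 4.0, 2.0   # illustrative values only


def ideal_spectrum(omega):
    """Ideal density: N/2W inside the band, zero outside."""
    return N / (2.0 * W) if abs(omega) <= 2.0 * math.pi * W else 0.0


lower, n_hat = n_hat_bounds(ideal_spectrum, W)
```

For a spectrum that is not strictly bandlimited, the same routine gives a grid-based estimate only, and `p_max` must be large enough for the neglected tail of the sum to be small.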

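If $S(\omega)$ is nonincreasing for $\omega \ge 0$, then for $p = 1, 2, \ldots$,

$$\sup_{0 \le \omega < 2\pi W} S(\omega + 4\pi W p) \le \frac{1}{2\pi W} \int_{4\pi W p - 2\pi W}^{4\pi W p} S(\omega) \, d\omega$$

and

$$\sup_{0 \le \omega < 2\pi W} S(\omega - 4\pi W p) \le \frac{1}{2\pi W} \int_{-4\pi W p + 2\pi W}^{-4\pi W (p-1)} S(\omega) \, d\omega.$$

Thus, for $S(\omega)$ nonincreasing for $\omega \ge 0$, we have the bound

$$\hat{N} \le 2W \sup_{0 \le \omega < 2\pi W} \sum_{p=-1}^{0} S(\omega + 4\pi W p) + \frac{1}{\pi} \int_{2\pi W}^{\infty} S(\omega) \, d\omega.$$

The exponent $\beta$ has been estimated by Shannon.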

The basic idea of the proof of the theorem is, roughly speaking, to
(i) treat as an additional "noise source" the departure of the samples of
$v(\cdot)$ from the corresponding samples in the case of zero intersymbol
interference (Sublemma 1 of Section IV provides an estimate of this
departure), and (ii) obtain a lower bound on the channel capacity
of the more realistic model by comparing its error probability performance
with that of a model possessing zero intersymbol interference
and independent Gaussian noise samples (this is done in the proof of
Sublemma 2 of Section IV).

IV. PROOF OF THE THEOREM 

4.1 The Discrete Channel 

Consider first a discrete channel with memory that receives one of $M$
equally likely inputs (i.e., code words) every $T$ seconds. As in Section
2.1, each input is a real $n$-vector $X$ which satisfies $|X|^2 \le \rho T$, $n$ is
equal to $2\mu T$, and each input represents a particular sequence of $RT$
binary digits. Let $(x_1, x_2, \ldots, x_n)$ denote the first code word,
$(x_{n+1}, x_{n+2}, \ldots, x_{2n})$ the second code word, and so on.

At time $t = (j - 1)T$, the receiver receives the $n$-vector

$$Y_j = \{y[1 + (j-1)n], \, y[2 + (j-1)n], \ldots, y[jn]\}$$

in which

$$y(p) = \sum_{k=1}^{\infty} x_k \varphi(p - k) + z(p), \quad p = 1, 2, \ldots$$

where here $\varphi(\cdot)$ is a function defined on the integers so that $\varphi(0) = 1$
and

$$\sum_{k=-\infty}^{\infty} |\varphi(k)| < \infty,$$

and each $z(p)$ is a Gaussian random variable with zero mean. For each $j$,
let

$$Z_j = \{z[1 + (j-1)n], \, z[2 + (j-1)n], \ldots, z[jn]\}$$

and

$$V_j = \{v[1 + (j-1)n], \, v[2 + (j-1)n], \ldots, v[jn]\},$$

where

$$v(p) = \sum_{k=1}^{\infty} x_k \varphi(p - k).$$

Then $Y_j = V_j + Z_j$.
We assume that the receiver attempts to determine the $j$th code
word $\hat{V}_j$ (defined in Section 4.3) by minimum distance decoding as in Section III. Let $p_{ei}$ denote
the error probability associated with the transmission of code word $i$,
as defined in Section III. In Section 4.3 we prove the following result,
which we shall exploit here, concerning this channel.

Lemma: Let $Z_j$, as defined above, possess the property that [with $\mathcal{E}$ the
expectation operator and $(\cdot,\cdot)$ denoting the usual inner product of
$n$-vectors] there exist constants $\epsilon$ and $\eta$ such that for every real $n$-vector $U$ of
unit length:

$$0 < \epsilon \le \mathcal{E}|(U, Z_j)|^2 \le \eta$$

uniformly in $j$ and $n$. Let $\gamma \in (0,1)$. Then any rate

$$R < \mu \log_2\left(1 + \frac{\gamma\rho}{2\mu\eta}\right)$$

is permissible (in the sense of Section 2.1) provided that

$$\sum_{k \neq 0} |\varphi(k)| < (1 - \gamma^{1/2}) \left(\frac{\eta\beta}{\gamma\rho}\right)^{1/2}$$

where $\beta = \beta[(\eta/\gamma), \rho, \mu, R]$ is the number introduced in Section 2.1.

4.2 Completion of the Proof of the Theorem

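$$\mathcal{E}|(U, Z_j)|^2 = \mathcal{E} \sum_{k,l} u_k u_l \, z[k + (j-1)n] \, z[l + (j-1)n] = \sum_{k,l} u_k u_l \, R[(l - k)/2W]$$

for any real $n$-vector $U$, in which

$$R(\tau) = \frac{1}{2\pi} \int_{-\infty}^{\infty} S(\omega) e^{i\omega\tau} \, d\omega.$$

Thus,

$$\begin{aligned}
\mathcal{E}|(U, Z_j)|^2 &= \frac{1}{2\pi} \sum_{k,l} u_k u_l \int_{-\infty}^{\infty} S(\omega) e^{i\omega(l-k)/2W} \, d\omega \\
&= \frac{1}{2\pi} \int_{-\infty}^{\infty} \left| \sum_{k=1}^{n} u_k e^{-i\omega k/2W} \right|^2 S(\omega) \, d\omega \\
&= \frac{1}{2\pi} \sum_{p=-\infty}^{\infty} \int_{-2\pi W + 4\pi W p}^{2\pi W + 4\pi W p} \left| \sum_{k=1}^{n} u_k e^{-i\omega k/2W} \right|^2 S(\omega) \, d\omega \\
&= \frac{1}{2\pi} \int_{-2\pi W}^{2\pi W} \left| \sum_{k=1}^{n} u_k e^{-i\omega k/2W} \right|^2 \sum_{p=-\infty}^{\infty} S(\omega + 4\pi W p) \, d\omega.
\end{aligned}$$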
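It follows at once that

$$\mathcal{E}|(U, Z_j)|^2 \le \sup_{0 \le \omega < 2\pi W} \sum_{p=-\infty}^{\infty} S(\omega + 4\pi W p) \cdot \frac{1}{2\pi} \int_{-2\pi W}^{2\pi W} \left| \sum_{k=1}^{n} u_k e^{-i\omega k/2W} \right|^2 d\omega,$$

and that

$$\mathcal{E}|(U, Z_j)|^2 \ge \inf_{0 \le \omega < 2\pi W} \sum_{p=-\infty}^{\infty} S(\omega + 4\pi W p) \cdot \frac{1}{2\pi} \int_{-2\pi W}^{2\pi W} \left| \sum_{k=1}^{n} u_k e^{-i\omega k/2W} \right|^2 d\omega.$$

Since

$$\frac{1}{2\pi} \int_{-2\pi W}^{2\pi W} \left| \sum_{k=1}^{n} u_k e^{-i\omega k/2W} \right|^2 d\omega = 2W |U|^2,$$

we have

$$\mathcal{E}|(U, Z_j)|^2 \le 2W \sup_{0 \le \omega < 2\pi W} \sum_{p=-\infty}^{\infty} S(\omega + 4\pi W p)$$

$$\mathcal{E}|(U, Z_j)|^2 \ge 2W \inf_{0 \le \omega < 2\pi W} \sum_{p=-\infty}^{\infty} S(\omega + 4\pi W p)$$

for $|U| = 1$, independent of $j$ and $n$. Thus, we may view the time
continuous system of Section III as a discrete-time communication
system of the type described at the outset of this section with $\mu = W$,
$\rho = 2WP$,

$$\epsilon = 2W \inf_{0 \le \omega < 2\pi W} \sum_{p=-\infty}^{\infty} S(\omega + 4\pi W p),$$

and

$$\eta = 2W \sup_{0 \le \omega < 2\pi W} \sum_{p=-\infty}^{\infty} S(\omega + 4\pi W p).$$

This proves the theorem.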
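
The two-sided bound on $\mathcal{E}|(U, Z_j)|^2$ can be illustrated numerically. The Python sketch below assumes, purely for illustration, a noise process with autocorrelation $R(\tau) = \sigma^2 e^{-a|\tau|}$ (equivalently $S(\omega) = 2a\sigma^2/(a^2 + \omega^2)$, which is dominated by a multiple of $(1 + \omega^2)^{-1}$). For that choice the samples have correlation $\sigma^2 r^{|m|}$ with $r = e^{-a/2W}$, and by the Poisson summation formula $2W \sum_p S(\omega + 4\pi W p) = \sigma^2 (1 - r^2)/(1 - 2r\cos(\omega/2W) + r^2)$, so the infimum and supremum are available in closed form. All names and numerical values are assumptions of this sketch.

```python
import math
import random


def quadratic_form(u, r, sigma2):
    """Evaluate sum over k,l of u_k u_l R[(l-k)/2W], R[m/2W] = sigma2 * r**|m|."""
    n = len(u)
    return sum(u[k] * u[l] * sigma2 * r ** abs(l - k)
               for k in range(n) for l in range(n))


def random_unit_vector(n, rng):
    """Draw a random n-vector and scale it to unit Euclidean length."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


W = 1.0        # bandwidth (illustrative)
a = 3.0        # decay rate of the assumed autocorrelation (illustrative)
sigma2 = 1.0   # noise variance R(0) (illustrative)
r = math.exp(-a / (2.0 * W))   # correlation between adjacent samples

# Extremes of 2W * sum_p S(omega + 4*pi*W*p) for the assumed spectrum.
eps = sigma2 * (1.0 - r) / (1.0 + r)   # infimum, approached as omega -> 2*pi*W
eta = sigma2 * (1.0 + r) / (1.0 - r)   # supremum, attained at omega = 0

rng = random.Random(0)
forms = [quadratic_form(random_unit_vector(8, rng), r, sigma2)
         for _ in range(200)]
```

Every value in `forms` lies between `eps` and `eta`, in agreement with the bounds derived in this section.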

4.3 Proof of the Lemma 

With $x_k$ as defined in Section 4.1, let

$$\hat{V}_j = \{x_{1+(j-1)n}, \, x_{2+(j-1)n}, \ldots, x_{jn}\}.$$

Sublemma 1:

$$|V_j - \hat{V}_j|^2 \le 2\rho T \left( \sum_{\substack{k=-\infty \\ k \neq 0}}^{\infty} |\varphi(k)| \right)^2.$$
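Proof:

$$\begin{aligned}
|V_j - \hat{V}_j|^2 &= \sum_{p=1+(j-1)n}^{jn} \left| \sum_{k=1}^{\infty} x_k \varphi(p - k) - x_p \right|^2 \\
&= \sum_{p=1+(j-1)n}^{jn} \left| \sum_{k=-\infty}^{\infty} x_k \tilde{\varphi}(p - k) \right|^2
\end{aligned}$$

in which $x_k = 0$ for $k < 1$, $\tilde{\varphi}(0) = 0$, and $\tilde{\varphi}(k) = \varphi(k)$ for $k \neq 0$. Therefore,

$$|V_j - \hat{V}_j|^2 = \sum_{p} \left| \sum_{k} x_{p-k} \tilde{\varphi}(k) \right|^2,$$

and, by the Schwarz inequality,

$$\begin{aligned}
|V_j - \hat{V}_j|^2 &\le \sum_{p} \left( \sum_{k} |x_{p-k}|^2 |\tilde{\varphi}(k)| \right) \sum_{k} |\tilde{\varphi}(k)| \\
&= \left( \sum_{k} |\tilde{\varphi}(k)| \sum_{p} |x_{p-k}|^2 \right) \sum_{k} |\tilde{\varphi}(k)|.
\end{aligned}$$

Since

$$\sum_{p} |x_{p-k}|^2 \le 2\rho T,$$

we have

$$|V_j - \hat{V}_j|^2 \le 2\rho T \left( \sum_{\substack{k=-\infty \\ k \neq 0}}^{\infty} |\varphi(k)| \right)^2,$$

which is the assertion of Sublemma 1.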
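
The bound of Sublemma 1, namely $|V_j - \hat{V}_j|^2 \le 2\rho T (\sum_{k \neq 0} |\varphi(k)|)^2$, can be spot-checked numerically. The Python sketch below builds a short sequence of code words, each of energy $\rho T$, applies a hypothetical discrete pulse $\varphi$ with $\varphi(0) = 1$ and a few small nonzero taps, and compares the intersymbol-interference energy in each block with the bound. The tap values, block sizes, and names are illustrative assumptions.

```python
import random


def isi_energy(x, phi, j, n):
    """|V_j - Vhat_j|^2 for block j (1-based); x[k-1] holds x_k, zero elsewhere.

    phi maps an integer lag to phi(lag); unlisted lags are zero, phi(0) = 1.
    """
    total = 0.0
    for p in range(1 + (j - 1) * n, j * n + 1):
        v_p = sum(x_k * phi.get(p - k, 0.0)
                  for k, x_k in enumerate(x, start=1))
        total += (v_p - x[p - 1]) ** 2  # subtract the desired sample x_p
    return total


n, blocks = 6, 4
rho_T = 1.0   # energy budget rho*T per code word (illustrative)
phi = {0: 1.0, -2: 0.05, -1: -0.2, 1: 0.3, 2: -0.1}   # hypothetical pulse samples

rng = random.Random(1)
x = []
for _ in range(blocks):
    word = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    scale = (rho_T / sum(c * c for c in word)) ** 0.5
    x.extend(c * scale for c in word)   # each code word has |X|^2 = rho*T

isi_sum = sum(abs(c) for lag, c in phi.items() if lag != 0)
bound = 2.0 * rho_T * isi_sum ** 2
energies = [isi_energy(x, phi, j, n) for j in range(1, blocks + 1)]
```

Each entry of `energies` stays below `bound`. The factor 2 in the bound reflects the fact that any window of $n$ consecutive samples overlaps at most two code words.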

Therefore, with $Y_j$ and $Z_j$ as defined in Section 4.1, we have

$$Y_j = \hat{V}_j + E_j + Z_j$$

in which

$$|E_j|^2 \le 2\rho T \left( \sum_{k \neq 0} |\varphi(k)| \right)^2.$$

This fact when combined with the following result* proves the lemma.

Sublemma 2: Consider a time-discrete channel of the type described in
Section 2.1. Replace $Z$ by the $n$-vector $(E + Q)$ in which $E$ is a fixed vector
and the components of $Q$ are Gaussian random variables with zero mean
with the property that there exist constants $\epsilon$ and $\eta$ such that for every real
$n$-vector $U$ of unit length:

$$0 < \epsilon \le \mathcal{E}|(U, Q)|^2 \le \eta$$

uniformly in $n$. Let $\gamma \in (0,1)$. Then any rate

$$R < \mu \log_2\left(1 + \frac{\gamma\rho}{2\mu\eta}\right)$$

is permissible (in the sense of Section 2.1) provided that

$$|E|^2 \le \vartheta T$$

for all $T > 0$, in which

$$\vartheta < 2(1 - \gamma^{1/2})^2 \frac{\eta\beta}{\gamma}$$

where $\beta = \beta(\eta/\gamma, \rho, \mu, R)$ is the number introduced in Section 2.1.

* See Ref. 3, Appendix D, for a result related to Sublemma 2.

Proof: Let $T_0 \in (0, \infty)$. Consider the time-discrete channel of Section 2.1
with noise vector $Z$, but with $\eta$ replaced with $(1/\gamma)\eta$. Here for

$$R < \mu \log_2\left(1 + \frac{\gamma\rho}{2\mu\eta}\right)$$

and $T \ge T_0$, there exists a code $\{X_i\}$ such that $X_i \neq X_j$ for $i \neq j$,
and the error probability (using minimum distance decoding) given
that the $i$th code word was transmitted

$$\hat{p}_{ei} \triangleq \Pr \bigcup_{j \neq i} \{ |X_i + Z - X_j| \le |Z| \}$$

is at most $\exp[-\beta T + \theta(T)]$ independent of $i$, where

$$\beta = \beta[(\eta/\gamma), \rho, \mu, R]$$

and $\theta(T)/T \to 0$ as $T \to \infty$. For this code, the error probability (using
minimum distance decoding) for the channel described in Sublemma 2 is

$$p_{ei} = \Pr \bigcup_{j \neq i} \{ |X_i + E + Q - X_j| \le |E + Q| \}.$$

Let $c_{ij} = |X_i - X_j|$, and let $U_{ij}$ denote the unit-length vector
$(X_i - X_j)/c_{ij}$. Then it can easily be shown that

$$|X_i + E + Q - X_j| \le |E + Q|$$

if and only if

$$(U_{ij}, Q) \le -\tfrac{1}{2} c_{ij} - (U_{ij}, E),$$

in which $(\cdot,\cdot)$ denotes the usual inner product of $n$-vectors. Thus,

$$p_{ei} = \Pr \bigcup_{j \neq i} \{ (U_{ij}, Q) \le -\tfrac{1}{2} c_{ij} - (U_{ij}, E) \}$$

and similarly,

$$\hat{p}_{ei} = \Pr \bigcup_{j \neq i} \{ (U_{ij}, Z) \le -\tfrac{1}{2} c_{ij} \}. \tag{3}$$
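
The equivalence between the distance comparison and the half-space condition follows from expanding $|D + w|^2 \le |w|^2$ with $D = X_i - X_j$ and $w = E + Q$, which gives $(D, w) \le -\tfrac{1}{2}|D|^2$. The Python sketch below checks this on random vectors; the helper names and the random test data are illustrative assumptions.

```python
import math
import random


def dot(a, b):
    """Usual inner product of two n-vectors."""
    return sum(p * q for p, q in zip(a, b))


def norm(a):
    """Euclidean norm of an n-vector."""
    return math.sqrt(dot(a, a))


def error_event(x_i, x_j, w):
    """True when |X_i + w - X_j| <= |w|, i.e. X_j is at least as close as X_i."""
    shifted = [a + c - b for a, b, c in zip(x_i, x_j, w)]
    return norm(shifted) <= norm(w)


def half_space_event(x_i, x_j, w):
    """True when (U_ij, w) <= -c_ij / 2 with c_ij = |X_i - X_j|."""
    d = [a - b for a, b in zip(x_i, x_j)]
    c_ij = norm(d)
    u_ij = [v / c_ij for v in d]
    return dot(u_ij, w) <= -0.5 * c_ij


rng = random.Random(2)
agree = True
for _ in range(1000):
    x_i = [rng.uniform(-1.0, 1.0) for _ in range(5)]
    x_j = [rng.uniform(-1.0, 1.0) for _ in range(5)]
    w = [rng.gauss(0.0, 1.0) for _ in range(5)]   # plays the role of E + Q
    agree = agree and (error_event(x_i, x_j, w) == half_space_event(x_i, x_j, w))
```

The two tests agree on every trial, apart from the measure-zero boundary case where floating-point rounding could in principle separate them.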

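Consider (3). Let the $n$-vector $P = (p_1, p_2, \ldots, p_n)$ represent a
general point in Euclidean $n$-space $\mathcal{E}_n$, and let $\mathcal{R}_{ij}$ denote the closed
half-space of $\mathcal{E}_n$ throughout which $(U_{ij}, P) \le -\tfrac{1}{2}c_{ij}$. Let $\mathcal{R}_i = \bigcup_{j \neq i} \mathcal{R}_{ij}$.

Then

$$\hat{p}_{ei} = \left(\frac{2\pi\eta}{\gamma}\right)^{-n/2} \int_{\mathcal{R}_i} \exp\left[-\frac{\gamma}{2\eta} \sum_{k=1}^{n} z_k^2\right] dz_1 \cdots dz_n.$$

Similarly, let $\mathcal{S}_{ij}$ denote the closed half-space throughout which

$$(U_{ij}, P) \le -[\tfrac{1}{2}c_{ij} + (U_{ij}, E)]\gamma^{-1/2}$$

and let

$$\mathcal{S}_i = \bigcup_{j \neq i} \mathcal{S}_{ij}.$$

Then, since

$$p_{ei} = \Pr \bigcup_{j \neq i} \{ (U_{ij}, \gamma^{-1/2}Q) \le -[\tfrac{1}{2}c_{ij} + (U_{ij}, E)]\gamma^{-1/2} \},$$

we have, with $\Lambda$ the covariance matrix of the random variables $\{\gamma^{-1/2} q_i\}$,

$$p_{ei} = (2\pi)^{-n/2} (\det \Lambda)^{-1/2} \int_{\mathcal{S}_i} \exp[-\tfrac{1}{2} Q' \Lambda^{-1} Q] \, dq_1 \cdots dq_n.$$

Let us assume that

$$[\tfrac{1}{2}c_{ij} + (U_{ij}, E)]\gamma^{-1/2} \ge \tfrac{1}{2}c_{ij} \tag{4}$$

for all $j \neq i$. Then $\mathcal{S}_{ij} \subseteq \mathcal{R}_{ij}$, $\mathcal{S}_i \subseteq \mathcal{R}_i$, and hence

$$p_{ei} \le (2\pi)^{-n/2} (\det \Lambda)^{-1/2} \int_{\mathcal{R}_i} \exp[-\tfrac{1}{2} Q' \Lambda^{-1} Q] \, dq_1 \cdots dq_n.$$

Let $Q = \Xi Y$, where $\Xi$ is the orthogonal matrix such that $\Xi' \Lambda^{-1} \Xi
= \mathrm{diag}(\lambda_1^{-1}, \lambda_2^{-1}, \ldots, \lambda_n^{-1})$, with the understanding that $\lambda_1$ and $\lambda_n$
denote the smallest and largest eigenvalues of $\Lambda$, respectively. Then

$$p_{ei} \le (2\pi)^{-n/2} (\lambda_1 \lambda_2 \cdots \lambda_n)^{-1/2} \int_{\mathcal{R}_i'} \exp\left[-\frac{1}{2} \sum_{k=1}^{n} \lambda_k^{-1} y_k^2\right] dy_1 \cdots dy_n$$

in which $\mathcal{R}_i'$ denotes the inverse image of $\mathcal{R}_i$ under the transformation
represented by $\Xi$. Similarly,

$$\hat{p}_{ei} = (2\pi)^{-n/2} \left(\frac{\gamma}{\eta}\right)^{n/2} \int_{\mathcal{R}_i'} \exp\left[-\frac{\gamma}{2\eta} \sum_{k=1}^{n} y_k^2\right] dy_1 \cdots dy_n.$$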

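Since, by assumption,

$$\epsilon \le \mathcal{E}|(U, Q)|^2 \le \eta$$

for every real $n$-vector $U$ of unit length and every positive integer $n$,
it follows that $\lambda_1 \ge \epsilon\gamma^{-1}$ and $\lambda_n \le \eta\gamma^{-1}$. We note that for $0 < \lambda_j \le \eta\gamma^{-1}$:

$$\lambda_j^{-1/2} \exp[-\tfrac{1}{2}\lambda_j^{-1} y_j^2] \le \left(\frac{\gamma}{\eta}\right)^{1/2} \exp\left[-\frac{\gamma}{2\eta} y_j^2\right]$$

provided that $y_j^2 \ge \eta/\gamma$. Thus,

$$\begin{aligned}
(2\pi)^{-n/2} (\lambda_1 \lambda_2 \cdots \lambda_n)^{-1/2} &\int_{\mathcal{R}_i' - \mathcal{C}} \exp\left[-\frac{1}{2} \sum_{k=1}^{n} \lambda_k^{-1} y_k^2\right] dy_1 \cdots dy_n \\
&\le (2\pi)^{-n/2} \left(\frac{\gamma}{\eta}\right)^{n/2} \int_{\mathcal{R}_i' - \mathcal{C}} \exp\left[-\frac{\gamma}{2\eta} \sum_{k=1}^{n} y_k^2\right] dy_1 \cdots dy_n \\
&\le \hat{p}_{ei},
\end{aligned}$$

in which $\mathcal{C}$ denotes the hypercube in $\mathcal{E}_n$ defined by the inequalities:
$y_j^2 \le \eta/\gamma$ for $j = 1, 2, \ldots, n$.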
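Therefore,

$$p_{ei} \le \hat{p}_{ei} + (2\pi)^{-n/2} (\lambda_1 \lambda_2 \cdots \lambda_n)^{-1/2} \int_{\mathcal{C}} \exp\left[-\frac{1}{2} \sum_{k=1}^{n} \lambda_k^{-1} y_k^2\right] dy_1 \cdots dy_n.$$

However,

$$\begin{aligned}
(2\pi)^{-n/2} (\lambda_1 \lambda_2 \cdots \lambda_n)^{-1/2} &\int_{\mathcal{C}} \exp\left[-\frac{1}{2} \sum_{k=1}^{n} \lambda_k^{-1} y_k^2\right] dy_1 \cdots dy_n \\
&= \prod_{k=1}^{n} (2\pi\lambda_k)^{-1/2} \int_{-(\eta/\gamma)^{1/2}}^{(\eta/\gamma)^{1/2}} e^{-\frac{1}{2}\lambda_k^{-1} y^2} \, dy \\
&\le r^n,
\end{aligned}$$

in which

$$r \triangleq (2\pi)^{-1/2} \int_{-(\eta/\epsilon)^{1/2}}^{(\eta/\epsilon)^{1/2}} e^{-\frac{1}{2}x^2} \, dx.$$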
Thus,

$$p_{ei} \le \hat{p}_{ei} + r^n \le \exp[-\beta T + \theta(T)] + r^{2\mu T}. \tag{5}$$

Since $r < 1$, the right side of (5) approaches zero as $T \to \infty$. Therefore,
to complete the proof of Sublemma 2, it suffices to show that there
exist values of $T_0$ such that (4) is satisfied (for all $j \neq i$) for all $T \ge T_0$.
We note first that (4) is satisfied if

$$-(U_{ij}, E) \le \tfrac{1}{2}(1 - \gamma^{1/2}) c_{ij} \tag{6}$$

for all $j \neq i$. Since $-(U_{ij}, E) \le |E|$, (6) is satisfied if

$$|E| \le \tfrac{1}{2}(1 - \gamma^{1/2}) c_{ij} \tag{7}$$

for all $j \neq i$.

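We now estimate the numbers $c_{ij}$. We have,⁴ with $a = \tfrac{1}{2}c_{ij}(\gamma/\eta)^{1/2}$,

$$\exp[-\beta T + \theta(T)] \ge \hat{p}_{ei} \ge \Pr\{(U_{ij}, Z) \le -\tfrac{1}{2}c_{ij}\} = (2\pi)^{-1/2} \int_{-\infty}^{-a} e^{-\frac{1}{2}x^2} \, dx,$$

for any $i$ and any $j \neq i$, since the variance of $(U_{ij}, Z)$ is $\eta/\gamma$. Therefore,

$$\exp[-\beta T + \theta(T)] \ge (2\pi)^{-1/2} \int_{-\infty}^{-a} e^{-\frac{1}{2}x^2} \, dx = (2\pi)^{-1/2} \int_{a^2/2}^{\infty} e^{-y} (2y)^{-1/2} \, dy. \tag{8}$$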



Let $\delta > 0$ be a constant, and let $a_0(\delta)$ denote the smallest nonnegative
number such that

$$(2y)^{-1/2} \ge e^{-\delta y} \quad \text{for all } y \ge a_0(\delta).$$

Then

$$\exp[-\beta T + \theta(T)] \ge (2\pi)^{-1/2} \int_{a^2/2}^{\infty} \exp[-(1 + \delta)y] \, dy = (2\pi)^{-1/2} (1 + \delta)^{-1} \exp[-\tfrac{1}{2}(1 + \delta)a^2]$$

for $a^2 \ge 2a_0(\delta)$, from which it follows at once that

$$a^2 \ge 2(1 + \delta)^{-1}\beta T - 2(1 + \delta)^{-1}\{\ln[(2\pi)^{1/2}(1 + \delta)] + \theta(T)\}$$

for $a^2 \ge 2a_0(\delta)$. Since $\exp[-\beta T + \theta(T)] \to 0$ as $T \to \infty$, we see from
(8) that for each $a_0(\delta) > 0$, there exists a constant $T_\delta > 0$ such that
$a^2 > 2a_0(\delta)$ for all $T \ge T_\delta$. Thus, for each $\delta > 0$ there exists a $T_\delta \in (0, \infty)$
such that

$$c_{ij}^2 \ge 8(1 + \delta)^{-1}\gamma^{-1}\eta\beta T - 8(1 + \delta)^{-1}\gamma^{-1}\eta\{\ln[(2\pi)^{1/2}(1 + \delta)] + \theta(T)\} \tag{9}$$

for all $T \ge T_\delta$.

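Inequality (7) is therefore satisfied for all $T \ge T_0$ if $T_0 \ge T_\delta$ and

$$|E|^2 \le 2(1 - \gamma^{1/2})^2 (1 + \delta)^{-1}\gamma^{-1}\eta\beta T - 2(1 - \gamma^{1/2})^2 (1 + \delta)^{-1}\gamma^{-1}\eta\{\ln[(2\pi)^{1/2}(1 + \delta)] + \theta(T)\} \tag{10}$$

for all $T \ge T_0$. By assumption: $|E|^2 \le \vartheta T$ for all $T > 0$, in which

$$\vartheta < 2(1 - \gamma^{1/2})^2 \gamma^{-1}\eta\beta.$$

Choose $\delta > 0$ so that

$$\vartheta < 2(1 - \gamma^{1/2})^2 (1 + \delta)^{-1}\gamma^{-1}\eta\beta,$$

and then let $T_0 \in [T_\delta, \infty)$ be so large that

$$\vartheta \le 2(1 - \gamma^{1/2})^2 (1 + \delta)^{-1}\gamma^{-1}\eta\beta - 2(1 - \gamma^{1/2})^2 (1 + \delta)^{-1}\gamma^{-1}\eta T^{-1}\{\ln[(2\pi)^{1/2}(1 + \delta)] + \theta(T)\}$$

for all $T \ge T_0$. Then (10) is satisfied for all $T \ge T_0$. This completes
the proof of Sublemma 2.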

V. FINAL REMARKS 

The writer is indebted to D. Hamming and L. A. Shepp for discus- 
sions concerning this work, and to J. Savage, D. Slepian, and A. Wyner 
for commenting on the draft. 